IEEE Journal of Biomedical and Health Informatics
● Institute of Electrical and Electronics Engineers (IEEE)
Preprints posted in the last 90 days, ranked by how well they match IEEE Journal of Biomedical and Health Informatics's content profile, based on 34 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit.
E, S.; Wang, C.; Rao, T. D.; Kumar, T. S.
Major depressive disorder (MDD) is a common psychiatric disorder that requires reliable and objective assessment for early clinical intervention. Electroencephalography (EEG) is widely used for this purpose because it provides a non-invasive and low-cost measure of brain activity with high temporal resolution. However, EEG-based depression detection remains challenging due to the nonlinear nature of EEG signals, inter-subject variability, and the limited availability of subject-independent evaluation. To address these issues, this paper proposes a hybrid quantum-classical multiscale long short-term memory with parameterized quantum circuit branches (MS-LSTM-PQC) framework for subject-level EEG-based depression detection. The proposed model extracts temporal representations at multiple scales using parallel LSTM branches and incorporates eyes-closed (EC) and eyes-open (EO) condition information through condition-aware feature fusion. To further enhance the learned representations, scale-specific LSTM features are processed using PQC-based quantum branches implemented with TensorFlow Quantum (TFQ), providing an additional nonlinear feature transformation before classification. Experiments were conducted on the Mumtaz EEG depression dataset using EC-only, EO-only, and merged EC+EO conditions with 1-s, 2-s, and 3-s EEG windows. To reduce subject-level data leakage, all experiments were evaluated using 5-fold and 10-fold GroupKFold validation. The best overall accuracies across the evaluated settings were 92.05% and 95.08% under 5-fold and 10-fold GroupKFold validation, respectively. The 2-s merged EC+EO setting provided the most stable performance across validation protocols. In addition, Integrated Gradients (IG)-based explainability analysis showed that frontal and fronto-central channels, especially Fz, contributed more strongly to the model decision.
These results suggest that multiscale temporal learning with quantum-enhanced feature transformation can support subject-level EEG-based depression detection under leakage-controlled evaluation.
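The leakage control described in this abstract rests on keeping every window from one subject on the same side of each split. The following is a minimal sketch of that idea in plain Python, not the authors' code: the function name `group_kfold`, the greedy balancing rule, and the toy subject layout are assumptions made for illustration. In practice, scikit-learn's `GroupKFold` offers this kind of split.

```python
from collections import defaultdict


def group_kfold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs in which no group spans both sets."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of groups")

    # Greedy balancing (an assumption of this sketch): place the largest
    # groups first, each into the fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda k: (-len(by_group[k]), str(k))):
        smallest = min(range(n_splits), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_group[g])

    for f in range(n_splits):
        test = sorted(folds[f])
        train = sorted(i for k in range(n_splits) if k != f for i in folds[k])
        yield train, test


# Hypothetical layout: 6 subjects with 4 EEG windows each.
subjects = [s for s in range(6) for _ in range(4)]
splits = list(group_kfold(subjects, n_splits=3))
```

Each yielded pair can be checked by confirming that the subject IDs of the training and test indices do not intersect.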
Kurt, F.; Subasi, S. N.; Yakisan, E. S.; Subasi, A.
Background: Wearable technologies enable scalable and continuous monitoring of emotional states through passive sensing of physiological and behavioral signals. However, conventional learning approaches often struggle to model the complex temporal, contextual, and relational dependencies underlying human emotions. To address these limitations, we propose a graph-based framework that represents multimodal wearable observations as heterogeneous knowledge graphs enriched with semantic information derived from Large Language Models (LLMs), enabling richer contextual understanding beyond raw sensor measurements. Methods: We constructed a heterogeneous knowledge graph using multimodal Fitbit physiological signals and affective self-report data collected from 45 users. Mood prediction and emotion detection were formulated as both binary and ternary node classification tasks. We evaluated five baseline heterogeneous Graph Neural Network (GNN) architectures and compared them with the proposed Semantically Gated Augmented Graph Neural Network (SeGA-GNN) framework, which dynamically integrates LLM-generated semantic embeddings into graph representations through a gated cross-modal fusion mechanism. Results: The baseline GNN models achieved strong performance, with classification accuracies ranging from 0.7525 to 0.9739 for binary classification and 0.6249 to 0.9699 for ternary classification. The proposed SeGA framework consistently improved predictive performance across most architectures. In particular, semantic augmentation transformed the HAN model from moderate baseline performance into near-perfect emotion recognition capability, achieving SeGA-HAN Accuracy = 0.9988 and AUC = 1.0000 for binary classification and Accuracy = 0.9979 and AUC = 1.0000 for ternary classification.
Discussion and Conclusion: Integrating LLM-derived semantic contextualization into heterogeneous graph learning enables effective modeling of contextual information that is not directly captured by wearable physiological signals alone. The proposed SeGA-GNN framework demonstrates that adaptive semantic fusion substantially improves the accuracy, robustness, and interpretability of wearable-based emotion detection. These findings establish a promising direction for next-generation wearable affective computing systems and intelligent emotion-aware applications.
Zoofaghari, M.; Rahaimifard, A.; Chatterjee, S.; Balasingham, I.
Goal-oriented semantic communication has recently emerged in wireless sensor-actuator networks, emphasizing the meaning and relevance of information over raw data delivery, thereby enabling resource-efficient telecommunication. This paradigm offers significant benefits for intra-body or implantable sensor-actuator networks, including dramatic reductions in bandwidth requirements, latency, and power consumption. In this paper, we address a patch-based energy-efficient anomaly detection method for smart capsule endoscopy. We propose a deep learning-based algorithm that employs the similarity between features extracted from measured images and a reference (normal) image as the detection metric. The algorithm is evaluated using a clinical dataset of capsule-captured images, combined with a simulated intra-body channel model. The results demonstrate that even with only 60% of the transmission power (relative to a standard link design for QPSK modulation) and 65% of the light intensity, the probability of anomaly detection remains above 85%, and it gradually improves as power and illumination levels increase. This improvement translates into a potential battery life extension of over 43%. The findings highlight the potential of semantic-aware, energy-efficient intra-body devices for more sustainable and effective medical interventions.
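The detection metric in this abstract is a similarity between patch features and reference features. The abstract does not say which similarity is used, so the sketch below assumes cosine similarity and a fixed threshold. The names `cosine_similarity` and `is_anomalous`, the threshold of 0.8, and the three-element feature vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def is_anomalous(patch_features, reference_features, threshold=0.8):
    """Flag a patch whose similarity to the reference falls below threshold."""
    return cosine_similarity(patch_features, reference_features) < threshold


# Toy feature vectors standing in for deep features (assumed values).
reference = [1.0, 0.0, 1.0]
normal_patch = [0.9, 0.1, 1.1]
odd_patch = [0.0, 1.0, 0.0]
```

With these toy vectors, `normal_patch` stays close to the reference and `odd_patch` is orthogonal to it, so only the latter is flagged.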
Sharbaf, S.
Brain tumor detection using Magnetic Resonance Imaging (MRI) remains a challenging task due to tumor heterogeneity and imaging variability. This paper presents a novel hybrid Deep Convolutional Neural Network-Whale Optimization Algorithm (DCNN-WOA) framework for automated brain tumor detection and classification. The proposed method consists of four main stages: MRI data preprocessing and augmentation, deep feature extraction using multi-layer Convolutional Neural Networks (CNN), feature selection and hyperparameter optimization via the Whale Optimization Algorithm (WOA), and final classification with comprehensive performance evaluation. By jointly optimizing deep features and training parameters, the framework effectively reduces feature redundancy, accelerates convergence, and enhances model generalization. Experimental results on a publicly available MRI dataset demonstrate that the DCNN-WOA model outperforms conventional CNN and state-of-the-art Deep Learning (DL) architectures, achieving an accuracy of 97.8%, sensitivity of 96.4%, specificity of 98.1%, and F1-score of 97.2%. The practical impact of this approach makes it a promising solution for real-time clinical decision-support systems in neuroimaging.
Sakurai, R.; Kojima, S.; Otake-Matsuura, M.; Kanoh, S.; Rutkowski, T. M.
Traditional psychiatric assessments for depression are often hindered by subjective bias and patient recall inaccuracy. This paper presents a multimodal passive Brain-Computer Interface (pBCI) designed for the objective screening of depressive traits through the end-to-end decoding of neural dynamics. We implemented a hybrid EEG-fNIRS framework to capture synchronized electro-hemodynamic responses during an emotional working memory (EWM) task. To classify sub-clinical depressive tendencies based on BDI-II scores, we utilized SincShallowNet, a deep learning architecture optimized for raw signal processing via learnable Sinc-filters. Our results demonstrate that the pBCI achieves peak performance in the auditory modality, with the integration of EEG and low-pass filtered fNIRS (0.15 Hz) yielding a balanced accuracy of 90.9% and an F1-score of 0.867. By isolating purely endogenous neural markers during the EWM maintenance phase, the system provides a robust "silent observer" for mental state monitoring. These findings validate the potential of multimodal pBCIs as high-precision, data-driven tools for early-stage depression screening, offering a scalable alternative to traditional clinical interviews and a foundation for longitudinal mental health monitoring.
Ma, Y.; Chinthala, L.; Mohammed, A.; Davis, R. L.; Colonna, V.
Rare diseases are characterized by heterogeneous, weak, and sparse phenotypic signals that emerge gradually across longitudinal clinical visits, making early detection a persistent challenge. In this study, we propose a hierarchical set-to-sequence (HSS) framework for prospective rare disease detection using structured EHR data. HSS decomposes the problem into two levels: (1) intra-visit encoding via Multi-Query Attention (MQA), which treats heterogeneous clinical events within a single clinical visit as an unordered set to generate unified visit-level representations, and (2) inter-visit temporal modeling with transformer encoders conditioned on patient visit age and inter-visit time gaps to capture the disease progression and the irregular intervals between clinical visits. We construct a real-world cohort of 40,223 patients comprising 708,422 visits from a single academic medical center (2005-2025), with 3,032 rare disease cases identified by curated rule-based phenotyping including severe neuro-developmental, congenital, or genetic conditions. We formulate the task as multi-horizon prospective binary classification with five prediction horizons of 7, 30, 90, 180, and 365 days prior to first diagnosis. Experimental results show that the proposed HSS model consistently outperforms linear logistic regression, tree-based XGBoost, and Transformer-based baselines at every prediction horizon, ranging from AUROC = 0.893 and AUPRC = 0.601 at 7 days with 5.17% prevalence to AUROC = 0.829 and AUPRC = 0.228 at 365 days with 3.98% prevalence. Notably, the performance gap between HSS and the strongest competing baseline is largest at the 365-day horizon, indicating stronger advantages for long-horizon prediction where phenotypic signals for rare diseases are weak and sparse. Additional analyses further clarify the contribution of the hierarchical components and confirm the importance of hierarchical modeling.
This work contributes to the ongoing development of AI methodologies tailored to rare diseases by introducing a hierarchical framework for early detection using structured longitudinal clinical data.
Shah, A.; Mehta, A.; Bhensdadia, C. K.
Mental health challenges among university students have increased due to academic pressure, lifestyle changes, and continuous digital engagement. Existing approaches for mental health assessment often rely either on self-reported psychological scales or isolated behavioral indicators, limiting their ability to capture complex temporal and contextual patterns. This study proposes an interpretable multimodal framework for student mental health risk assessment using behavioral sensing, academic information, ecological momentary assessments (EMA), and psychometric survey data. A bidirectional Long Short-Term Memory autoencoder is employed to learn latent temporal representations from day-level behavioral sequences, while graph embeddings capture structural relationships among students using similarity-based neighborhood graphs. These representations are fused with academic and survey-derived features and reduced using Principal Component Analysis and Uniform Manifold Approximation and Projection. K-means clustering is then applied to identify behaviorally distinct student groups. Experimental analysis on the StudentLife dataset demonstrates meaningful clustering performance with a Silhouette Score of 0.4209 and Adjusted Rand Index stability of 0.6869. The identified clusters correspond to low-risk, moderate-risk, and high-risk behavioral profiles. To improve interpretability and practical usability, a fuzzy inference system is introduced to compute mental risk, academic risk, and wellbeing indices using psychometric indicators including PHQ-9, PSS, PANAS, VR-12, and Big Five personality traits. The results demonstrate the potential of combining multimodal behavioral modeling with interpretable fuzzy reasoning to support early mental health risk assessment in educational settings.
Addepalli, V. r.; Rao, P.; Kiselica, A.; Kummerfeld, E.; Abdalnabi, N.; Lee, K.
Monitoring activities of daily living (ADLs) in the home is a promising approach for tracking dementia progression in older adults. While ambient sensor-based ADL systems are well-studied, most existing ADL recognition systems rely on globally trained models that ignore the spatial organization of in-home activities. In real deployments, where training data are sparse and highly home-specific, global transformer models may fail to capture room-dependent behavioral structure. We propose a deterministic Mixture of Experts (MoE) architecture for in-home ADL recognition, in which each expert is a compact transformer specialized to one room of the home (bedroom, kitchen, bathroom, living area). Input segments are routed using a deterministic gating strategy based on room-level motion activity and time-of-day priors for sleep-related behaviors. Unlike learned routing networks, the proposed gate encodes domain knowledge about where ADLs are likely to occur, reducing model complexity under limited per-home training data. By decomposing ADL recognition into room-specific activity spaces, the proposed architecture reduces competition between dominant and low-frequency activities under highly imbalanced residential data. We evaluated the system on data collected via low-cost ambient sensors (motion, light, temperature, humidity) and Raspberry Pi edge devices across five homes, with ground-truth ADL labels provided by participants and caregivers. Across the five homes, the proposed MoE consistently outperformed global transformer, 1D CNN, and Random Forest baselines, achieving macro-F1 scores ranging from 0.60 to 0.88, highlighting the importance of home-specific modeling in real-world deployments. These findings suggest that room-aware expert specialization may provide a practical and interpretable strategy for low-data ADL recognition in real-world residential environments.
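The deterministic gate in this abstract routes each input segment to a room-specific expert using motion activity and a time-of-day prior. A minimal sketch of such a rule follows. It is an illustration under stated assumptions rather than the authors' gate: the function `route_segment`, the night window of 22:00 to 06:00, and the tie-breaking order are invented here.

```python
ROOMS = ("bedroom", "kitchen", "bathroom", "living_area")


def route_segment(motion_counts, hour, night_start=22, night_end=6):
    """Return the room whose expert should handle this segment.

    motion_counts maps room name -> number of motion events in the segment.
    """
    # Assumed time-of-day prior: a motionless home at night is treated as
    # sleep-related activity and sent to the bedroom expert.
    is_night = hour >= night_start or hour < night_end
    total_motion = sum(motion_counts.get(room, 0) for room in ROOMS)
    if is_night and total_motion == 0:
        return "bedroom"

    # Otherwise route to the room with the most motion events. Ties,
    # including the all-zero daytime case, fall back to the order in ROOMS.
    return max(ROOMS, key=lambda room: motion_counts.get(room, 0))


daytime_choice = route_segment({"kitchen": 5, "bedroom": 1}, hour=12)
night_choice = route_segment({}, hour=23)
```

Because the rule has no trainable parameters, it adds nothing to fit from the sparse per-home data, which matches the motivation given in the abstract.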
Su, H.; Fan, W.; Peng, J.; Zhang, Y.
High bit-depth medical images preserve subtle intensity variations that are important for quantitative analysis and clinical interpretation, but their large dynamic range poses challenges for efficient compression. We propose a bit-plane-aware dual-stream compression framework for 16-bit medical images by separately modeling the most significant bit (MSB) and least significant bit (LSB) components. The MSB structural stream is encoded using JPEG coding with a Duplicate Segment Skipping (DSS) strategy to exploit spatial and segment-level redundancy, while the LSB detail stream is compressed using learned image compression to represent residual variations and fine-grained details. Experiments on four MRI and CT datasets show that the proposed method consistently outperforms representative traditional and learning-based codecs, achieving the lowest bit rate across all datasets. Meanwhile, it preserves high reconstruction fidelity. As a downstream application, we further demonstrate that the compressed bitstreams can be effectively integrated with DNA encoding and converted into sequences with favorable biochemical properties.
chen, w.; Yang, X.; Lu, J.; Miao, M.; Huang, Y.; Zheng, S.; Zhang, C.; Xie, L.; Zhang, Y.
Whole-body SPECT bone scintigraphy reflects skeletal metabolic activity throughout the body and plays an indispensable role in the screening, treatment evaluation, and prognostic assessment of bone metastases in tumors. However, the automatic detection and segmentation of hypermetabolic bone lesions remain challenging due to low contrast, limited spatial resolution, and complex lesion distributions. In this study, we proposed Bone-Segnet, a dual-view guided automatic segmentation network for hypermetabolic bone lesions that integrated multi-scale feature modeling, global context modeling, and view-conditioned modulation. Pixel-level annotated anterior and posterior whole-body bone scintigraphy images were used for model training and prediction. The proposed network enhanced the recognition of low-contrast and small-scale lesions through small-lesion enhancement and multi-scale contextual modeling. A Transformer module was further introduced to strengthen global feature representation, while cross-view collaborative modeling was achieved by incorporating the complementary characteristics of anterior and posterior imaging. Experimental results demonstrated that the proposed method outperformed existing approaches across multiple evaluation metrics, with the Dice score improving from 0.7440 to 0.8750, indicating a substantial improvement in segmentation performance. Further quantitative analysis based on the segmentation results revealed significant differences among disease types in lesion count, pixel burden, and spatial distribution patterns, reflecting the heterogeneity of disease-related skeletal metabolic activity. Overall, the proposed method improved automatic lesion segmentation performance and enabled quantitative analysis of lesion burden and spatial distribution patterns, providing objective data support for the assessment of related diseases. Index Terms: Whole-body SPECT, bone lesion segmentation, dual-view modeling, quantitative analysis.
Huang, X.; Hsieh, C.; Nguyen, Q.; Renteria, M. E.; Gharahkhani, P.
Wearable-derived physiological features have been associated with disease risk, but most current studies focus on single conditions, limiting understanding of cross-disease patterns. This study adopts a trans-diagnostic approach to examine whether wearable data capture shared and condition-specific physiological signatures across multiple chronic conditions spanning physical and mental health, and then evaluates the utility of these features for disease classification. A total of 9,301 patients with at least 21 days of consecutive Fitbit data from the All of Us Controlled Tier Dataset version 8 were analyzed. Disease subcohorts included cardiovascular disease (CVD), diabetes, obstructive sleep apnea (OSA), major depressive disorder (MDD), anxiety, bipolar disorder, and attention-deficit/hyperactivity disorder (ADHD), chosen based on prevalence and relevance. Logistic regression and XGBoost models were fitted for each disease subcohort versus the control cohort. We found that, compared to using just baseline demographic and lifestyle features, incorporating wearable-derived features improved classification performance in all subcohorts for both models, except for ADHD, where improvement was mainly observed for ROC-AUC in the logistic regression model, likely due to the smaller sample size of the ADHD subcohort. The largest performance gains were observed in MDD (increase in ROC-AUC of 0.077 for logistic regression, 0.071 for XGBoost; p < 0.001) and anxiety (increase in ROC-AUC of 0.077 for logistic regression, 0.108 for XGBoost; p < 0.001). This study provides one of the first comprehensive transdiagnostic evaluations of wearable-derived features for disease classification, highlighting their potential to enhance risk stratification in the real-world setting as a practical complement to clinical assessments and providing a foundation to explore more fine-grained wearable data.
Author summary: Wearable devices such as fitness trackers and smartwatches are becoming increasingly popular and affordable, providing continuous measurements of heart rate, physical activity, and sleep. Alongside the growing digitization of health records, this creates new opportunities for large-scale, real-world health studies. In this study, we analyzed wearable-derived physiological patterns across a range of chronic conditions spanning both physical and mental health to better understand how these signals relate to disease risk. We found that incorporating wearable-derived heart rate, activity and sleep features improved disease risk classification across several conditions, with particularly strong gains for major depressive disorder and anxiety. By examining how individual features contributed to model predictions, we also identified meaningful associations between physiological signals and disease risk. For example, both duration and day-to-day variation of deep and rapid eye movement (REM) sleep were associated with increased risk in certain conditions. Our study supports the development of real-time, automated tools to assess disease risk alongside clinical care.
Chen, P.-W.; Cielo, C.; Walsh, O.; Mcdonald, M.; Song, P. X.; Goldstein, C.; Moreno, J. P.; Jansen, E.; Mitchell, J. A.
Introduction: Actigraphy sleep-wake classification methods increasingly seek to leverage raw acceleration data and machine-learning-based classification, but performance evaluation in pediatrics is limited. We trained machine-learning models using pediatric data and compared their sleep-wake classification performance with existing algorithms for children. Methods: Sixty-five children (46% female, ages 5.3 to 17.7 years) completed in-lab overnight polysomnography and wore a GENEActiv device on their non-dominant wrist. The acceleration data were converted into 30-second epochs and aligned with physician-scored sleep-wake data from electroencephalography. Seven machine-learning models were trained using leave-one-subject-out cross-validation. Epoch-by-epoch analyses generated performance metrics (e.g., balanced accuracy [BA]) and discrepancy analyses provided overall sleep duration bias estimates. Models were ranked on the combination of highest performance and least bias using Euclidean distance scores, where a lower score represents closer to perfect performance and zero bias. For benchmarking, we included GGIR sleep scoring algorithms and an adult-trained random forest classifier. Results: Overall, 560.1 hours of polysomnography and actigraphy data were collected (74.4% of epochs were scored as sleep). The pediatric-trained local-global long short-term memory (LSTM) classifier had the best epoch-by-epoch performance (e.g., BA=0.85, sensitivity=0.88, specificity=0.83, ROC-AUC=0.95, and Cohen kappa=0.67). These metrics exceeded those of an adult-trained random forest classifier and GGIR-based algorithms. Discrepancy analyses revealed that overall sleep duration was underestimated by an average of 25 minutes using the LSTM classifier with no proportional bias. Conclusion: We trained seven pediatric sleep-wake classifiers that had strong ability to detect sleep and wake, with the LSTM classifier performing best.
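The ranking rule in this abstract combines performance and bias into a single Euclidean distance from an ideal point (perfect performance, zero bias). The sketch below shows one way such a score could be computed. The abstract does not state how bias is scaled against accuracy, so the 60-minute `bias_scale` is an assumption, and the three candidate models and their numbers are invented for illustration, not results from the study.

```python
import math


def distance_to_ideal(balanced_accuracy, bias_minutes, bias_scale=60.0):
    """Euclidean distance from the ideal point (BA = 1, bias = 0).

    bias_scale converts minutes into units comparable to accuracy; the
    value of 60 is an assumption of this sketch.
    """
    return math.hypot(1.0 - balanced_accuracy, bias_minutes / bias_scale)


# Hypothetical candidates: (balanced accuracy, sleep-duration bias in minutes).
candidates = {
    "model_a": (0.85, -25.0),
    "model_b": (0.80, -10.0),
    "model_c": (0.88, 50.0),
}

# Lower distance means closer to perfect performance and zero bias.
ranking = sorted(candidates, key=lambda name: distance_to_ideal(*candidates[name]))
```

Note how `model_c` has the highest balanced accuracy but ranks last under this scaling because of its large bias, which is the trade-off the combined score is meant to expose.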
Chen, Z.; Wu, R.; Liu, Y.; Li, R.; Duprey, A.
The integration of Large Language Models into high-stakes clinical workflows is critically hampered by their lack of verifiable reliability and tendency to generate hallucinations. This paper introduces Med-ICE, an autonomous framework designed to enhance the reliability of LLMs for medical applications. Med-ICE adapts the Iterative Consensus Ensemble paradigm, enabling a group of peer LLM agents to collaboratively converge on a final answer through iterative rounds of generation and peer review, thereby eliminating the need for an external arbiter and its associated scalability bottleneck. Our work makes three key contributions: (1) a novel semantic consensus mechanism that determines agreement based on semantic similarity, crucial for nuanced clinical language; (2) demonstration of state-of-the-art performance, where Med-ICE significantly outperforms both direct single-LLM generation and the Self-Refinement technique on challenging medical benchmarks; and (3) a highly efficient and scalable architecture, as our Semantic Consensus Monitor is computationally lightweight. This research establishes a new standard for developing safer, more trustworthy LLM systems, paving the way for their responsible integration into medicine.
Baroud, S.
Migraine detection and sentiment analysis in healthcare have become increasingly important, particularly with the rise of social media platforms like Twitter, where users often share their personal health experiences. This study presents MASHA (Multi-Agent System for Healthcare Sentiment Analysis), an artificial intelligence (AI)-driven framework that integrates multiple machine learning (ML) models for sentiment analysis of Arabic tweets related to migraines. The system leverages a multi-agent architecture to handle tasks such as data acquisition, pre-processing, model training and real-time decision-making. Key ML models, including Support Vector Machines (SVM), Naive Bayes (NB) and Logistic Regression (LR), are integrated using ensemble techniques, leading to improved classification performance. Experiments conducted on a dataset of Arabic tweets demonstrate that MASHA outperforms traditional methods, achieving an accuracy of 90.0% and an F1-score of 89.46%. Moreover, the system's scalability and flexibility make it suitable for real-time public health monitoring, offering valuable insights into patient experiences and public sentiment regarding healthcare services. MASHA's adaptability suggests its potential application for analysing other healthcare-related conditions, reinforcing the system's scalability and broader relevance. Future work will focus on incorporating deep learning (DL) models and expanding the dataset with content from additional social media platforms.
Ogretir, M.; Kaipainen, V.; Leskinen, M.; Lahdesmaki, H.; Koskinen, M.
Neonates requiring intensive care are at increased risk for long-term neuropsychiatric disorders. However, clinical adoption of risk prediction models remains limited when their performance lacks adequate interpretability for informed clinical decision-making. Here, we investigated whether longitudinal neonatal electronic health record (EHR) data from the first 90 days of life can support clinically meaningful interpretation of long-term risk signals for major neuropsychiatric diagnoses by age seven. In a retrospective register-based cohort of 17,655 at-risk children from an academic medical center, of whom 8.0% (1,420) received a major neuropsychiatric diagnosis during follow-up, we applied a time-aware transformer model (Self-supervised Transformer for Time-Series; STraTS) and thoroughly evaluated its predictions using three complementary interpretability approaches: perturbation-based variable importance, value-dependent effect analysis, and leave-one-out (LOO) feature attribution. STraTS achieved the highest area under the precision-recall curve (AUPRC 0.171 ± 0.022), compared with Random Forest (0.166 ± 0.008), logistic regression (0.151 ± 0.007), and XGBoost (0.128 ± 0.010). Across interpretability methods, five predictors were consistently identified: birth weight, gender, Apgar score at 1 minute, umbilical serum thyroid stimulating hormone (uS-TSH), and treatment time in hospital. Indicators of early clinical severity, including chromosomal abnormalities and neonatal cerebral-status disturbances, showed the largest risk-increasing effects. Furthermore, the model's learned vector representations of subject-specific EHR sequences formed clinically coherent latent embeddings that reflect population heterogeneity along established perinatal risk dimensions.
These findings demonstrate that combining multiple complementary interpretability methods yields stable, clinically plausible risk signals while revealing limitations that would remain undetected by any single approach, highlighting the importance of careful interpretability analysis of deep learning-based risk predictions.
Specht, B.; Tayeb, Z. Z.; Garbaya, S.; Khadraoui, D.; EL-Khozondar, M.; Schneider, R.
Accurate inference of physiological state across the menstrual cycle has important applications in reproductive health and in understanding symptom dynamics, yet most non-hormonal approaches rely on wearable sensors or calendar-based tracking. Whether self-reported symptoms alone can support prospective, cross-subject phase classification remains unresolved. Here, we introduce a hybrid modelling framework that combines a gradient-boosted classifier with a Hidden Semi-Markov Model to infer four menstrual cycle phases (menstrual, follicular, fertile, and luteal) from self-reported data. The classifier captures non-linear symptom patterns, while the temporal model imposes biologically grounded constraints, including cyclic ordering and realistic phase durations. In a leave-one-subject-out evaluation using hormonally annotated data from 41 participants, the model achieved 67.6% accuracy and a macro F1 score of 0.662. Features reflecting short-term symptom variability were more informative than absolute symptom levels, indicating that within-person fluctuation provides a more generalisable signal of cycle phase than symptom intensity alone. These findings demonstrate the feasibility of low-burden, device-free menstrual health monitoring, establish symptom dynamics as a basis for scalable digital biomarkers, and expand access to tracking in resource-constrained settings and populations underserved by wearable-based approaches.
Rahjouei, A.
Actigraphy is widely used for long-term sleep monitoring, but established sleep-wake scoring algorithms often require parameter tuning, which is commonly performed manually and can reduce reproducibility. In this study, a grid-search-based calibration framework for established actigraphy algorithms is presented and evaluated to determine whether it can serve as a practical alternative to manual tuning. The method was evaluated using two datasets: a multi-subject polysomnography-validated actigraphy dataset and a self-collected dual-device dataset. In the polysomnography-validated dataset, grid-search optimization produced performance patterns similar to manual parameter selection, while slightly improving detection of sleep onset and sleep offset and yielding modest gains in wake-sensitive metrics. In the dual-device dataset, consensus and majority voting were useful for reducing the influence of brief wake episodes occurring within the main sleep period, including micro-awakenings that can fragment sleep predictions across individual algorithms. Overall, these findings show that grid search can replace manual parameter tuning with a more explicit and reproducible procedure while providing small improvements in sleep timing estimation and benefiting ensemble-based handling of within-sleep wakefulness.
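The calibration idea in this abstract, replacing manual tuning with an exhaustive search over a parameter grid, can be sketched in a few lines. The example below is illustrative only: `score_sleep` is a toy moving-average threshold rule rather than one of the established actigraphy algorithms, plain accuracy stands in for whatever objective the study optimized, and the activity counts and reference labels are invented.

```python
from itertools import product


def score_sleep(counts, threshold, window):
    """Toy scorer: an epoch is sleep (1) when the centred moving average
    of activity counts is below threshold, otherwise wake (0)."""
    half = window // 2
    labels = []
    for i in range(len(counts)):
        lo, hi = max(0, i - half), min(len(counts), i + half + 1)
        mean = sum(counts[lo:hi]) / (hi - lo)
        labels.append(1 if mean < threshold else 0)
    return labels


def accuracy(pred, ref):
    """Fraction of epochs where prediction and reference agree."""
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)


def grid_search(counts, ref, thresholds, windows):
    """Try every (threshold, window) pair; keep the first best by accuracy."""
    best = None
    for threshold, window in product(thresholds, windows):
        acc = accuracy(score_sleep(counts, threshold, window), ref)
        if best is None or acc > best[0]:
            best = (acc, threshold, window)
    return best


# Invented activity counts per epoch and reference sleep (1) / wake (0) labels.
counts = [50, 40, 5, 0, 2, 0, 1, 30, 60]
ref = [0, 0, 1, 1, 1, 1, 1, 0, 0]
best_acc, best_threshold, best_window = grid_search(counts, ref, [5, 10, 20, 40], [1, 3])
```

Because the grid and the objective are written down explicitly, rerunning the search reproduces the same parameter choice, which is the reproducibility benefit the abstract describes.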
Ge, Z.; Liu, S.; Dou, W.
Background and Objective: Normative modeling is a key tool for understanding brain alterations in neurodegenerative diseases, such as cerebellar-type multiple system atrophy. However, existing methods lack interpretability and fail to capture clinically meaningful pathological changes. This study presents DINMC, a Deep Interpretable Normative Model Construction framework, which combines autoencoder-based learning with statistical hypothesis testing to better capture and interpret disease-specific neuroanatomical changes. Methods: The DINMC framework constructs normative models using neuroimaging data from multi-site large healthy cohorts. It utilizes a U-shaped convolutional autoencoder to train these models, which are then applied to reconstruct brain features from both patients and healthy controls within the same study cohort. Pathological confidence values are derived by fusing original and deviation feature spaces, offering a measure of disease-related pathology reflected in each dimension of the features. The framework was validated through statistical analysis and prognostic classification and regression tasks. Results: The pathological confidence provides valuable insights into the neuroanatomical regions most affected by the disease, as well as the correlation between changes in these regions and clinical assessment scales. Our optimal model outperforms traditional methods in prognostic prediction tasks, with an AUC of 0.972 for classification tasks and an R² of 0.432 for regression tasks. Conclusion: DINMC provides a novel and interpretable framework for neuroimaging analysis. By combining deep learning and statistical hypothesis testing, this framework offers a unique solution to improving both the interpretability and performance of normative models in neuroimaging. The approach is scalable to other neuroimaging datasets, offering a versatile tool for broader biomedical applications.
Pounds, D.; Gupta, V.; TRIPATHI, H.; Neupane, S.
This paper focuses on forecasting minute-by-minute stress, anxiety, and affective states using wearable sensor data. It addresses mental health as a growing concern and the limitations of traditional assessment methods. A time-series machine learning framework was developed using electrodermal activity (EDA) and heart rate variability (HRV) features from the WESAD dataset. Models were trained and evaluated for minute-by-minute prediction of self-reported psychological states. Both classification models (stress, anxiety) and regression models (affect) were explored, comparing time-series and static approaches. Findings support the feasibility of real-time, personalized mental health monitoring using wearable devices and their potential for timely interventions in clinical or remote settings. The paper demonstrates how temporal modeling can enhance emotional state prediction and inform future research and system development.
Georgiou, G. P.; Paphiti, M.
Autism spectrum disorder (ASD) is a neurodevelopmental condition for which timely and accurate detection remains a major clinical priority. Early and reliable identification is important because it can facilitate access to assessment, diagnosis, and appropriate support; however, current diagnostic pathways still rely largely on behavioural evaluation and clinical judgement. In this context, machine-learning (ML) approaches have attracted growing interest because they can identify subtle and complex patterns in speech data that may not be easily captured through conventional methods. The current study capitalizes on this potential by developing and evaluating ML models for distinguishing autistic individuals from neurotypical individuals based on speech features. More specifically, acoustic features of vowels, including fundamental frequency (F0), first three formants (F1, F2, F3), duration, jitter, shimmer, harmonics-to-noise ratio (HNR), and intensity, were elicited from 18 autistic adults and 18 neurotypical adults through a controlled production task. Then, four supervised ML models were trained and evaluated on these features: LightGBM, Random Forest, Support Vector Machine, and XGBoost. All models demonstrated good classification performance, with the best-performing model achieving a strong discriminability of 89%. The explainability analysis identified F0 as the most influential predictor by a substantial margin, followed by intensity, F3, and F1, while duration, shimmer, HNR, jitter, and F2 contributed more modestly. These findings demonstrate that vowel acoustics contain clinically relevant information for distinguishing autistic from neurotypical adult speech and highlight the potential of interpretable, speech-based ML as a transparent and scalable aid for ASD screening and assessment.